FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN
SINGLE-STEP MULTIPLE TESTING PROCEDURES

By Sanat K. Sarkar

Temple University 

Results on the false discovery rate (FDR) and the false nondiscov- 
ery rate (FNR) are developed for single-step multiple testing proce- 
dures. In addition to verifying desirable properties of FDR and FNR 
as measures of error rates, these results extend previously known re- 
sults, providing further insights, particularly under dependence, into 
the notions of FDR and FNR and related measures. First, consider- 
ing fixed configurations of true and false null hypotheses, inequalities 
are obtained to explain how an FDR- or FNR-controlling single-step 
procedure, such as a Bonferroni or Sidak procedure, can potentially 
be improved. Two families of procedures are then constructed, one 
that modifies the FDR-controlling and the other that modifies the 
FNR-controlling Sidak procedure. These are proved to control FDR 
or FNR under independence less conservatively than the correspond- 
ing families that modify the FDR- or FNR-controlling Bonferroni 
procedure. Results of numerical investigations of the performance of 
the modified Sidak FDR procedure over its competitors are presented. 
Second, considering a mixture model where different configurations 
of true and false null hypotheses are assumed to have certain proba- 
bilities, results are also derived that extend some of Storey's work to 
the dependence case. 

1. Introduction. The false discovery rate (FDR) and related measures 
have been receiving considerable attention due to their relevance as mea- 
sures of the overall error rate in multiple testing problems that arise in 
many scientific investigations, particularly in the context of DNA
microarray analysis.

Table 1
The outcomes in testing n null hypotheses

              Rejected   Accepted   Total
True null        V          U        n_0
False null       S          T        n_1
Total            R          A        n

Consider Table 1, which summarizes the outcomes in multiple testing of n
null hypotheses $H_1, \ldots, H_n$. Let $Q = V/R$ if $R > 0$ and $Q = 0$ if
$R = 0$, that is, the proportion of false positives (Type I errors) among the
rejected null hypotheses. Genovese and Wasserman [9] called this the false 
discovery proportion (FDP). The FDR is defined by $E(Q)$. It was first
introduced in multiple testing by Benjamini and Hochberg [1], who provided
a step-up procedure that controls the FDR with independent test statistics.
Later, Benjamini and Liu [4] offered a step-down FDR procedure under inde- 
pendence. The FDR-controlling property of the Benjamini-Hochberg (BH) 
procedure was extended by Benjamini and Yekutieli [5] to some positively 
dependent multivariate distributions. Sarkar [14] proved that the critical val- 
ues of the BH procedure can be used in a more general stepwise procedure 
to provide control of the FDR not only under independence, but also when 
the test statistics have the same type of positive dependence property as 
considered by Benjamini and Yekutieli [5]. In addition, he established the 
FDR-controlling property of the Benjamini-Liu step-down procedure for 
some positively dependent test statistics. Genovese and Wasserman [8, 9] 
investigated some operating characteristics of the BH procedure asymptoti- 
cally under independence and further extended the theory of FDR by taking 
a stochastic process approach. 

A slightly different concept of FDR, called the positive false discovery 
rate (pFDR), was considered by Storey [17]. It is defined as the conditional 
FDR given at least one rejection, that is, pFDR = $E(V/R \mid R > 0)$, and it
has the interpretation of a Bayesian Type I error rate under a mixture 
model involving i.i.d. p-values when a single-step multiple testing procedure 
is used; see also [18]. Storey [17] provided estimates of FDR and pFDR 
under the above mixture model for a single-step procedure that are related 
to the empirical Bayes FDR of Efron, Tibshirani, Storey and Tusher [7]; see 
also [6]. A new family of FDR procedures based on estimates of FDR was 
suggested by Storey [17] and Storey, Taylor and Siegmund [19]. 

An analog of FDR in terms of false negatives (Type II errors) was intro- 
duced by Genovese and Wasserman [8] and Sarkar [15]. It is the FNR, called 
false nondiscovery rate by Genovese and Wasserman [8] and the false
negatives rate by Sarkar [15]. It is defined by $E(N)$, where $N = T/A$ if
$A > 0$ and $N = 0$ if $A = 0$ is the proportion of false negatives among the
accepted null hypotheses, or the false nondiscovery proportion (FNP) [9].
Storey [18] defined the pFNR (positive false nondiscovery rate), the
conditional expectation $E(T/A \mid A > 0)$, as an analog of his pFDR. While
Genovese and Wasserman
[8] considered new methods that incorporate both FDR and FNR, Storey 
[18] established a connection between multiple testing and classification the- 
ory in terms of a combination of pFDR and pFNR. Sarkar [15] proved that 
the FNR can be controlled by a step-down analog of the BH procedure. He 
also introduced a concept of unbiasedness of an FDR- or FNR-controlling 
multiple testing procedure and established this property for a generalized 
stepwise procedure under independence. 
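
As a concrete illustration of the quantities in Table 1, the following minimal Python sketch computes the false discovery proportion Q and the false nondiscovery proportion N for one realized set of decisions. The function name and the boolean-list inputs are illustrative assumptions, not part of the paper; the FDR and FNR are the expectations of these two proportions.

```python
def error_proportions(is_true_null, is_rejected):
    """Return (Q, N): false discovery and false nondiscovery proportions.

    Both arguments are equal-length lists of booleans (hypothetical inputs).
    """
    if len(is_true_null) != len(is_rejected):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(is_true_null, is_rejected))
    V = sum(1 for null, rej in pairs if null and rej)          # true nulls rejected
    S = sum(1 for null, rej in pairs if not null and rej)      # false nulls rejected
    T = sum(1 for null, rej in pairs if not null and not rej)  # false nulls accepted
    U = sum(1 for null, rej in pairs if null and not rej)      # true nulls accepted
    R, A = V + S, U + T
    Q = V / R if R > 0 else 0.0   # Q = 0 when nothing is rejected
    N = T / A if A > 0 else 0.0   # N = 0 when nothing is accepted
    return Q, N


truth = [True, True, True, False, False]
reject = [True, False, False, True, False]
print(error_proportions(truth, reject))
```

Here V = 1, S = 1, T = 1 and U = 2, so Q = 1/2 and N = 1/3.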

In this article we mainly concentrate on single-step multiple testing pro- 
cedures, and we develop new results on FDR and FNR with dependent test 
statistics both under a model where the configuration of true and false null 
hypotheses is assumed fixed, yet unknown, and under the so-called mixture 
model where different configurations of true and false null hypotheses are 
assumed to have certain probabilities. The intent of these results is to verify 
some desirable properties of FDR and FNR and to extend some previously 
known results, thereby providing further insights into the notions of FDR 
and FNR and related measures, particularly under dependence. 

Suppose that $X = (X_1, \ldots, X_n)$ has a joint distribution indexed by the
set of parameters $\theta = (\theta_1, \ldots, \theta_n)$. Let
$H_i : \theta_i \le \theta_{i0}$ be tested against $K_i : \theta_i > \theta_{i0}$,
for some given $\theta_{i0}$, $i = 1, \ldots, n$. Let $\{H_i : i \in J_0\}$ and
$\{H_i : i \in J_1\}$ be the sets of true and false null hypotheses,
respectively. It will be assumed that $J_0$ is nonempty. Consider a
single-step procedure that rejects $H_i$ in favor of $K_i$ if $X_i \ge t$ for
some fixed $t$. Two of our main results with fixed $J_0$ and $J_1$ (Theorems 1
and 3) are that if $X$ is stochastically increasing in each $\theta_i$, which
is typically the case in many multiple testing problems, then the maximum
values of FDR and FNR of a single-step procedure are $(n_0/n)P\{R > 0\}$
and $(n_1/n)P\{A > 0\}$, respectively, where the probabilities are evaluated at
$\theta_0 = (\theta_{10}, \ldots, \theta_{n0})$ and $X$ is assumed exchangeable
under these null hypothesis values. In addition to representing more precise
versions of the results that state that Sidak and Bonferroni single-step
procedures control FDR or FNR, these theorems show how these procedures can
potentially be improved in terms of having better control of FDR or FNR by
borrowing information about $n_0$ or $n_1$ from the data in the spirit of
Benjamini and Hochberg [2], Benjamini, Krieger and Yekutieli [3], Storey [17]
and Storey, Taylor and Siegmund [19]. Storey, Taylor and Siegmund [19]
provided procedures for modifying the BH procedure using a family of
estimates of $n_0$ and proved that they control FDR under independence. We
obtain new families of procedures: one to modify the FDR-controlling and the
other to modify the FNR-controlling Sidak procedure. Considering independent
test statistics, we prove that they control FDR or FNR. The modified Sidak
FDR procedures are less conservative under independence than the
corresponding family that modifies the Bonferroni procedure obtained by using
the estimates of $n_0$ considered in [19]. An analogous result is true for
modified Sidak FNR procedures. Our method of modifying the Sidak FDR and the
Sidak FNR procedures relies directly on two new results, Theorems 2 and
4, which extend inequalities given by Theorems 1 and 3, respectively, under
independence from a single-step to a two-step procedure.

Next, we derive certain results that extend Storey's [17, 18] work to 
the dependent case. Storey obtained expressions for the FDR and FNR 
of a single-step procedure under a mixture model where, given any
configuration of true and false null hypotheses, the $X_i$'s are assumed to be
independent, providing useful Bayesian interpretations to his notions of
pFDR and pFNR. More specifically, he proved that
pFDR = $P\{H_i \text{ is true} \mid X_i \ge t\}$ and
pFNR = $P\{H_i \text{ is false} \mid X_i < t\}$, irrespective of the number of
tests. Assuming a more general mixture model in which the $X_i$'s are assumed
to be dependent with a location family of distributions and to have a
certain type of positive dependence structure, we prove in Theorems 5 and
6, respectively, that
pFDR $\le \max_{1 \le i \le n} P\{H_i \text{ is true} \mid X_i \ge t\}$ and
pFNR $\le \max_{1 \le i \le n} P\{H_i \text{ is false} \mid X_i < t\}$, with the
equalities holding under independence. An important implication of the first
inequality is that Storey's [17] q-value for a single-step multiple test
under certain commonly encountered types of dependence is more conservative,
as one would desire, than that under independence.

The paper is organized as follows. In Section 2 we formally define the 
stochastic increasing property we need for X to obtain the maximum values 
of FDR and FNR for fixed $J_0$ and $J_1$. Section 3 reports the results related
to FDR for fixed $J_0$ and $J_1$, and some numerical results that show the
performance of the modified Sidak procedure in controlling FDR compared 
to the modified Bonferroni and the original Bonferroni and Sidak procedures. 
Similar results related to FNR are presented in Section 4, of course without 
showing any additional numerical evidence. Section 5 numerically compares 
the Bonferroni and Sidak procedures with their modified versions in terms 
of a concept of power involving both FDR and FNR. Section 6 presents the 
results on FDR and FNR under the aforementioned mixture model with 
dependent X. Proofs are given in Section 7. The paper concludes with some 
final remarks in Section 8. 

2. Stochastically increasing family of distributions. This section defines 
a type of stochastic increasing property of a family of distributions that will 
be required to establish our results on FDR and FNR. Whenever an increasing
or decreasing condition or property in terms of $X$ or $\theta$ is mentioned,
it is to be understood as being coordinatewise.

Definition 1. An n-dimensional random vector $X = (X_1, \ldots, X_n)$ or
the corresponding family of distributions $\{P_\theta\}$, where
$\theta = (\theta_1, \ldots, \theta_n)$, is said to be stochastically
increasing in $\theta$ if $P_\theta\{X \in C\}$ is increasing in $\theta$ for
any set $C$ that is increasing.

Example 1 (Random variables with mixtures of independent stochastically
increasing distributions). In multiple testing, the $X_i$'s often have
distributions that are mixtures of independent stochastically increasing
distributions. That is, the density of $P_\theta$ is often of the form

$$\int \prod_{i=1}^{n} f_{i\theta_i}(x_i, y)\, dG(y),$$

where $f_{i\theta_i}(x, y)$ is stochastically increasing in $\theta_i$ for each
$y$ and $G$ is a probability distribution independent of $\theta$. A stronger
condition is often useful to check for the stochastic increasing property of
$f_{i\theta_i}(x, y)$ in $\theta_i$: for any $\theta_i < \theta_i'$, the ratio
$f_{i\theta_i'}(x, y)/f_{i\theta_i}(x, y)$ is increasing in $x$ for each $i$.
This is the monotone likelihood ratio (MLR) condition of Lehmann [12], which
is satisfied by many of the commonly used distributions. The multivariate
distribution of such random variables is stochastically increasing in
$\theta$.

Example 2 (Multivariate location family of distributions). Let the density
of $P_\theta$ be of the form $f_\theta(x) = f(x - \theta)$. Distributions of
this type are stochastically increasing. This is because, for any
$\theta \le \theta'$ and any increasing set $C$, we have

$$P_{\theta'}\{X \in C\} = P_\theta\{X \in C - (\theta' - \theta)\} \ge P_\theta\{X \in C\}.$$

Many of the distributions that arise in multiple testing are of the type in
Example 1 or 2. For instance, (i) independent normals with $\theta_i$'s
representing the means, (ii) absolute values of independent normals with
$\theta_i$'s representing the absolute means, (iii) independent chi-squares
where $\theta_i$'s are the scale parameters or (iv) scaled mixtures of all
these distributions, are of the type in Example 1. They arise in simultaneous
testing of means or variances of independent normals against one- or
two-sided alternatives. Multivariate F that arises in many-to-one comparisons
of variances against one-sided alternatives is another distribution of the
type in Example 1. Multivariate normal and multivariate t are distributions
of the type in Example 2, arising, for instance, in Dunnett's many-to-one
comparisons of means against one-sided alternatives in a one-way layout with
a known or unknown common variance.

3. Results on FDR for fixed $J_0$ and $J_1$. In this section we derive results
on the FDR of a single-step procedure, assuming fixed, but unknown, $J_0$ and
$J_1$. We use the following notation here and in the rest of the paper. Define
$J = \{1, \ldots, n\}$ and $J^{(-i)} = J - \{i\}$. Define
$X_{(1)} \le \cdots \le X_{(n)}$ as the ordered components of the set
$\{X_j : j \in J\}$ and $X^{(-i)}_{(1)} \le \cdots \le X^{(-i)}_{(n-1)}$ as those
of the subset $\{X_j : j \in J^{(-i)}\}$. We assume that the marginal
distribution of any $X_i$ depends on $\theta$ only through the corresponding
$\theta_i$.
First, we have the following lemma.



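Lemma 1. The FDR of the single-step procedure with fixed critical value
$t$ is given by

$$
\begin{aligned}
\mathrm{FDR}_\theta(t; J_0, J_1)
&= \sum_{i \in J_0} \left[ P_{\theta_i}\{X_i \ge t\}
   - \sum_{j=1}^{n-1} \frac{P_\theta\{X^{(-i)}_{(j)} \ge t,\ X_i \ge t\}}{(n-j)(n-j+1)} \right] \\
&= P_\theta\{X_{(n)} \ge t\}
   - \sum_{i \in J_1} \left[ P_{\theta_i}\{X_i \ge t\}
   - \sum_{j=1}^{n-1} \frac{P_\theta\{X^{(-i)}_{(j)} \ge t,\ X_i \ge t\}}{(n-j)(n-j+1)} \right].
\end{aligned}
\tag{3.1}
$$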
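
A brief sketch of why an expression of this form arises may be useful; it is reconstructed from the definitions above rather than quoted from the proofs in Section 7. It assumes the rejection rule $X_i \ge t$ and writes $R^{(-i)}$ (notation introduced only for this sketch) for the number of $X_j$, $j \ne i$, with $X_j \ge t$, so that $R = 1 + R^{(-i)}$ whenever $X_i \ge t$.

```latex
\begin{align*}
\frac{1}{1+m} &= 1 - \sum_{r=1}^{m} \frac{1}{r(r+1)}, \qquad m = 0, 1, \ldots, n-1, \\
\{R^{(-i)} \ge r\} &= \{X^{(-i)}_{(n-r)} \ge t\}, \qquad r = 1, \ldots, n-1, \\
\frac{I(X_i \ge t)}{R}
  &= I(X_i \ge t) - \sum_{r=1}^{n-1} \frac{I(X^{(-i)}_{(n-r)} \ge t,\ X_i \ge t)}{r(r+1)} \\
  &= I(X_i \ge t) - \sum_{j=1}^{n-1} \frac{I(X^{(-i)}_{(j)} \ge t,\ X_i \ge t)}{(n-j)(n-j+1)}
  \qquad (j = n - r).
\end{align*}
```

Taking expectations and summing over $i \in J_0$ gives an expression of the first form. Summing the same identity over all $i \in J$ gives $P\{R > 0\} = P_\theta\{X_{(n)} \ge t\}$, and subtracting the terms with $i \in J_1$ gives the second form.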



Now suppose that $X$ is stochastically increasing in $\theta$. Then, since the
set $\{X^{(-i)}_{(j)} \ge t,\ X_i \ge t\}$ is increasing in $X$, the probability
$P_\theta\{X^{(-i)}_{(j)} \ge t,\ X_i \ge t\}$ is increasing in $\theta$. The
probability $P_\theta\{X_{(n)} \ge t\}$ is also increasing in $\theta$ because
$\{X_{(n)} \ge t\}$ is an increasing set. Thus, using the first expression of
the FDR in (3.1), where $P_{\theta_i}\{X_i \ge t\}$ depends only on $\theta_i$
with $i \in J_0$, we notice that the FDR is decreasing in
$\{\theta_i : i \in J_1\}$ for fixed $\{\theta_i : i \in J_0\}$, whereas from the
second expression we see that it is increasing in $\{\theta_i : i \in J_0\}$
for fixed $\{\theta_i : i \in J_1\}$. In other words,
$\mathrm{FDR}_\theta(t; J_0, J_1)$ decreases as $\theta_i$ moves away from
$\theta_{i0}$ for at least one $i \in J_0$ or at least one $i \in J_1$, with

$$\sup_\theta \mathrm{FDR}_\theta(t; J_0, J_1) = \mathrm{FDR}_{\theta_0}(t; J_0, J_1), \tag{3.2}$$

where $\theta_0 = (\theta_{10}, \ldots, \theta_{n0})$. If $X$ is exchangeable
when $\theta = \theta_0$ with the common marginal c.d.f. $F_0$, the right-hand
side of (3.2) reduces to

$$
n_0 \left[ \bar F_0(t) - \sum_{j=1}^{n-1}
  \frac{P_{\theta_0}\{X^{(-i)}_{(j)} \ge t,\ X_i \ge t\}}{(n-j)(n-j+1)} \right]
= \frac{n_0}{n}\, \mathrm{FDR}_{\theta_0}(t; J, \phi),
\tag{3.3}
$$

where $\bar F_0 = 1 - F_0$ and $\phi$ represents a null set. Thus, we have the
following theorem, which is one of the main results of this article.

Theorem 1. If $X$ is stochastically increasing in $\theta$, then
$\mathrm{FDR}_\theta(t; J_0, J_1)$ decreases as $\theta_i$ moves away from
$\theta_{i0}$ for at least one $i \in J_0$ or for at least one $i \in J_1$.
Furthermore, if $X$ is exchangeable when $\theta = \theta_0$, then

$$\sup_\theta \mathrm{FDR}_\theta(t; J_0, J_1) = \frac{n_0}{n} P_{\theta_0}\{R > 0\}. \tag{3.4}$$
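
The bound in Theorem 1 can be checked exactly in the simplest setting. The sketch below assumes independent statistics and summarizes the model by two tail probabilities (illustrative parameters, not the paper's notation): p0 = P(X_i >= t) for every true null and p1 = P(X_i >= t) for every false null. With p1 = p0 it reproduces (n_0/n)P{R > 0}, and the FDR falls as p1 grows, that is, as the false nulls move away from their null values.

```python
from math import comb


def binom_pmf(m, p):
    """Probability mass function of Binomial(m, p), as a list of length m + 1."""
    return [comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(m + 1)]


def exact_fdr(n0, n1, p0, p1):
    """Exact FDR of a single-step test with independent statistics (n0 >= 1).

    p0 = P(X_i >= t) for each true null, p1 = P(X_i >= t) for each false null.
    """
    others_null = binom_pmf(n0 - 1, p0)  # rejections among the other true nulls
    false_null = binom_pmf(n1, p1)       # rejections among the false nulls
    expected_inverse = sum(
        a * b / (1 + j + k)
        for j, a in enumerate(others_null)
        for k, b in enumerate(false_null)
    )
    # FDR = sum over true nulls of E[ 1(X_i >= t) / R ]
    return n0 * p0 * expected_inverse


n0, n1, p = 7, 3, 0.01
n = n0 + n1
bound = (n0 / n) * (1 - (1 - p) ** n)   # (n0/n) P{R > 0} at the null values
print(exact_fdr(n0, n1, p, p), bound)   # agree up to rounding
print(exact_fdr(n0, n1, p, 0.5))        # smaller: false nulls far from the null
```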

Theorem 5.3 of [5] gives the above decreasing property of FDR with
respect to only $\{\theta_i, i \in J_1\}$ under the assumptions that
$\{X_i, i \in J_0\}$ and $\{X_i, i \in J_1\}$ are jointly independent and
$\{X_i, i \in J_1\}$ is stochastically increasing in $\{\theta_i, i \in J_1\}$.
Theorem 1 is a version of this for single-step procedures with dependent $X$
and one-sided null hypotheses.

As a corollary to Theorem 1, if the critical value $t$ provides a level
$\alpha$ test for the overall null hypothesis $\bigcap_{i=1}^{n} H_i$, that is,
if $t$ satisfies
$P_{\theta_0}\{R > 0\} = P_{\theta_0}\{\max_{i \in J} X_i \ge t\} \le \alpha$,
then we have

$$\mathrm{FDR}_\theta(t; J_0, J_1) \le \frac{n_0}{n}\,\alpha, \tag{3.5}$$

implying that the FDR is controlled at $\alpha$. Inequality (3.5) is
interesting in that it represents a single-step analog of the same inequality
known to hold for stepwise procedures with Simes [16] critical values
providing an $\alpha$-level test for $\bigcap_{i=1}^{n} H_i$ [1, 5, 14].
Regarding the choice for $t$, if one does not want to utilize the
distributional form of $X$ or if it is unknown, the Bonferroni critical value
that satisfies

$$F_0(t) = 1 - \frac{\alpha}{n} \tag{3.6}$$

can be used. If, however, $X$ is known to be positively dependent so that
the inequality $P_{\theta_0}\{\max_{j \in J} X_j < t\} \ge F_0^n(t)$ holds under
the null hypothesis values with the equality holding under independence, as
in the case of many distributions that arise in multiple testing, the Sidak
critical value $t$ that satisfies the equation

$$F_0(t) = (1 - \alpha)^{1/n} \tag{3.7}$$

offers a less conservative choice.
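
For a concrete comparison of the two choices, the sketch below computes the Bonferroni and Sidak critical values, taking (3.6) as F_0(t) = 1 - alpha/n and (3.7) as F_0(t) = (1 - alpha)^(1/n). The standard normal null distribution and the function names are assumptions made only for illustration.

```python
from statistics import NormalDist


def bonferroni_critical_value(n, alpha, null_dist=NormalDist()):
    """Solve F0(t) = 1 - alpha / n for t."""
    return null_dist.inv_cdf(1 - alpha / n)


def sidak_critical_value(n, alpha, null_dist=NormalDist()):
    """Solve F0(t) = (1 - alpha) ** (1 / n) for t."""
    return null_dist.inv_cdf((1 - alpha) ** (1 / n))


n, alpha = 100, 0.05
t_bonferroni = bonferroni_critical_value(n, alpha)
t_sidak = sidak_critical_value(n, alpha)
# The Sidak value is the smaller of the two, so it rejects at least as often.
print(t_bonferroni, t_sidak)
# Under independence the Sidak choice gives P{R > 0} = 1 - F0(t)^n = alpha.
print(1 - NormalDist().cdf(t_sidak) ** n)
```

Since (1 - alpha)^(1/n) <= 1 - alpha/n, the Sidak critical value never exceeds the Bonferroni one, which is the sense in which it is less conservative.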

We should point out that there is no surprise that the Bonferroni and
Sidak single-step procedures control FDR, because they are known to control
the familywise error rate (FWER). It is also known that, given $n_0$, it
can be incorporated in the Bonferroni and other procedures to improve their
FWER control [10]. What is new here is that Bonferroni and Sidak procedures
can be further improved in terms of having better control of FDR using
an estimate of $n_0$, in the spirit of Benjamini and Hochberg [2], Benjamini,
Krieger and Yekutieli [3], Storey [17] and Storey, Taylor and Siegmund [19].
For instance, since
$\sup_\theta \mathrm{FDR}_\theta(t; J_0, J_1) \le n_0\{1 - F_0(t)\}$, as we see
from Theorem 1, rather than controlling $n\{1 - F_0(t)\}$, which the
Bonferroni method does, a better control of FDR can be achieved if we control
$\hat n_0\{1 - F_0(t)\}$ for some appropriately chosen estimate $\hat n_0$ of
$n_0$. To estimate $n_0$, Storey [17] suggested using the ratio
$K_\tau / F_0(\tau)$, where $K_\tau = \sum_{i=1}^{n} I(X_i < \tau)$, for some
well-chosen $\tau$. However, Storey, Taylor and Siegmund [19] slightly
modified it and used

$$\hat n_0(\tau) = \frac{K_\tau + 1}{F_0(\tau)} \tag{3.8}$$

to obtain a new class of BH-type FDR-controlling procedures under
independence. We use this $\hat n_0$ in our modification to the Bonferroni
procedure. Also, the $X_i$'s that are small compared to $\tau$ should not be
declared large when modified Bonferroni is used. Thus, our modified
Bonferroni procedure rejects $H_i$ whenever

$$X_i \ge \max\left\{ \tau,\ F_0^{-1}\!\left(1 - \frac{\alpha}{\hat n_0(\tau)}\right) \right\}. \tag{3.9}$$
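
The rule just described can be sketched in a few lines. The code assumes that (3.8) is the estimate (K_tau + 1)/F_0(tau), with K_tau the number of X_i below tau, and that (3.9) rejects H_i when X_i is at least max{tau, F_0^{-1}(1 - alpha/n0_hat)}. The standard normal null, the function names and the data are illustrative assumptions.

```python
from statistics import NormalDist


def estimate_n0(xs, tau, null_dist=NormalDist()):
    """Estimate of the number of true nulls: (K_tau + 1) / F0(tau)."""
    k_tau = sum(1 for x in xs if x < tau)
    return (k_tau + 1) / null_dist.cdf(tau)


def modified_bonferroni_rejections(xs, tau, alpha, null_dist=NormalDist()):
    """Reject H_i when X_i >= max(tau, F0^{-1}(1 - alpha / n0_hat))."""
    n0_hat = estimate_n0(xs, tau, null_dist)
    threshold = max(tau, null_dist.inv_cdf(1 - alpha / n0_hat))
    return [x >= threshold for x in xs]


# Made-up test statistics; tau = 0 corresponds to F0(tau) = 1/2.
xs = [-1.2, 0.3, -0.4, 4.1, 0.8, 3.9, -0.1, 1.1]
print(estimate_n0(xs, tau=0.0))  # three values below 0, so (3 + 1) / 0.5
print(modified_bonferroni_rejections(xs, tau=0.0, alpha=0.05))
```

With these numbers the estimate equals n = 8, so the modified rule coincides with ordinary Bonferroni; it becomes less conservative only when fewer statistics fall below tau.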

We prove later in this section that our modified Bonferroni procedure 
controls FDR under independence and we provide numerical evidence show- 
ing that quite often this control can be achieved much less conservatively. 
However, when X is known to be independent or at least positively depen- 
dent, a modification to the Sidak procedure is expected to produce a better 
performing procedure than the modified Bonferroni procedure. So, we first 
modify the Sidak procedure. The following theorem suggests how the idea of 
modifying the Bonferroni procedure can be extended to that for the Sidak 
procedure. It extends the inequality for the FDR under independence, given 
by Theorem 1, from a single-step to a two-step procedure that, for some
fixed $\tau \in (-\infty, \infty)$ and a predetermined function
$t_\tau(k) \ge \tau$, $k = 0, 1, \ldots, n$, first finds
$k = \max_{0 \le i \le n}\{i : X_{(i)} < \tau\}$ (note that $X_{(0)} = -\infty$),
then rejects all $H_i$ for which $X_i \ge t_\tau(k)$.

Theorem 2. Let $X$ be independent with the distribution of $X_i$, indexed by
the parameter $\theta_i$, belonging to an MLR family and having identical
marginals when $\theta = \theta_0$. Then, for a two-step procedure with
$t_\tau(k) \ge \tau$, for all $k = 0, 1, \ldots, n$, the FDR satisfies the
inequality

$$
\mathrm{FDR}^{(2)}_\theta(t_\tau \ge \tau; J_0, J_1)
\le \bar F_0(\tau) \sum_{i \in J_0} \sum_{k=0}^{n-1} \frac{1}{n-k}
\left[ 1 - \left( 1 - \frac{\bar F_0(t_\tau(k))}{\bar F_0(\tau)} \right)^{n-k} \right]
P_\theta\{X^{(-i)}_{(k)} < \tau \le X^{(-i)}_{(k+1)}\}
\tag{3.10}
$$

(with $X^{(-i)}_{(0)} = -\infty$ and $X^{(-i)}_{(n)} = \infty$).

When $\tau = -\infty$, $k = 0$ with probability 1 and (3.10) reduces to the one
given by Theorem 1 under independence with $t = t_{-\infty}(0)$. It is
interesting to see that
$\mathrm{FDR}^{(2)}_\theta(t_\tau \ge \tau; J_0, J_1) \le \mathrm{FDR}_\theta(\tau; J_0, J_1)$.

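The modified Bonferroni procedure is a two-step procedure with $t_\tau(k)$
given by the right-hand side of (3.9) given $K_\tau = k$; that is, $t_\tau(k)$
is such that
$\bar F_0(t_\tau(k)) = \min\{\bar F_0(\tau),\ \alpha F_0(\tau)/(k+1)\}$. We
propose to modify the Sidak procedure using a two-step procedure where
$t_\tau(k)$ is such that

$$
\bar F_0(t_\tau(k)) = \bar F_0(\tau)
\left[ 1 - \left( 1 - \min\left\{ 1,\ \frac{\alpha (n-k) F_0(\tau)}{(k+1) \bar F_0(\tau)} \right\} \right)^{1/(n-k)} \right]
\tag{3.11}
$$

with $t_\tau(n) = \infty$. The right-hand side of (3.10) for this modified
Sidak procedure is less than or equal to

$$
\begin{aligned}
& \alpha F_0(\tau) \sum_{i \in J_0} \sum_{k=0}^{n-1} \frac{1}{k+1}
  P_\theta\{X^{(-i)}_{(k)} < \tau \le X^{(-i)}_{(k+1)}\} \\
&\quad \le \alpha \sum_{i \in J} \sum_{k=0}^{n-1} \frac{1}{k+1}
  P_\theta\{X_i < \tau,\ X^{(-i)}_{(k)} < \tau \le X^{(-i)}_{(k+1)}\} \\
&\quad = \alpha \sum_{k=1}^{n} P_\theta\{X_{(k)} < \tau \le X_{(k+1)}\} \\
&\quad = \alpha P_\theta\{X_{(1)} < \tau\}
\end{aligned}
\tag{3.12}
$$

(with $X_{(n+1)} = \infty$); see, for example, [13], page 497, for the first
equality in (3.12). Thus, we see that our modified Sidak procedure controls
FDR under independence.
The right-hand side of (3.10) is less than or equal to

$$
\sum_{i \in J_0} \sum_{k=0}^{n-1} \bar F_0(t_\tau(k))\,
P_\theta\{X^{(-i)}_{(k)} < \tau \le X^{(-i)}_{(k+1)}\},
\tag{3.13}
$$

which, for the modified Bonferroni procedure, is less than or equal to the
first expression in (3.12). Thus, the FDR of the modified Bonferroni procedure
is also less than or equal to $\alpha P_\theta\{X_{(1)} < \tau\}$ and, hence, is
controlled; of course, it is controlled more conservatively than the modified
Sidak procedure.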
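
The comparison between the two modifications can be made concrete by tabulating the null tail probability that each assigns to its second-step critical value. The sketch assumes the modified Bonferroni value satisfies 1 - F_0(t_tau(k)) = min{1 - F_0(tau), alpha F_0(tau)/(k + 1)}, and reads (3.11) as 1 - F_0(t_tau(k)) = (1 - F_0(tau)) [1 - (1 - m)^(1/(n-k))] with m = min{1, alpha (n - k) F_0(tau) / ((k + 1)(1 - F_0(tau)))}. That reading of (3.11), the function names and the argument tau_cdf = F_0(tau) are assumptions of this sketch.

```python
def modified_bonferroni_tail(k, tau_cdf, alpha):
    """Null tail probability at t_tau(k) for the modified Bonferroni rule."""
    return min(1 - tau_cdf, alpha * tau_cdf / (k + 1))


def modified_sidak_tail(k, n, tau_cdf, alpha):
    """Null tail probability at t_tau(k) for the modified Sidak rule.

    Returns 0.0 when k == n, matching t_tau(n) = infinity.
    """
    if k == n:
        return 0.0
    tail = 1 - tau_cdf
    m = min(1.0, alpha * (n - k) * tau_cdf / ((k + 1) * tail))
    return tail * (1 - (1 - m) ** (1 / (n - k)))


n, tau_cdf, alpha = 100, 0.5, 0.05
for k in (0, 25, 50, 75, 99):
    print(k, modified_bonferroni_tail(k, tau_cdf, alpha),
          modified_sidak_tail(k, n, tau_cdf, alpha))
```

Because 1 - (1 - m)^(1/N) >= m/N for m in [0, 1], the Sidak tail probability is never smaller than the Bonferroni one, so the modified Sidak critical value is never larger.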

We conducted a numerical study to investigate the extent of improvement
offered by our modified Sidak procedure in controlling FDR over the
modified Bonferroni and the original Bonferroni and Sidak procedures. We
generated n = 100 dependent random variables $X_i \sim N(\mu_i, 1)$,
$i = 1, \ldots, 100$, with the same variance 1 and a common correlation
$\rho$, and performed 100 hypothesis tests of $\mu_i = 0$ against $\mu_i > 0$,
each using first the Bonferroni critical value and then the Sidak critical
value corresponding to $\alpha = 0.05$. The value of $Q$ was then calculated
for each procedure by setting $n_0$ of the $\mu_i$'s to zero and the remaining
$\mu_i$'s to a positive value $\delta$. The FDR then was estimated by averaging
the $Q$ values over 5000 iterations. Thus, we have the simulated FDR of the
Bonferroni and Sidak procedures. We chose $F_0(\tau) = 1/2$ and similarly
calculated the FDR of the modified Bonferroni and Sidak procedures
corresponding to this $\tau$. Table 2 compares the FDRs of the Bonferroni
and Sidak procedures and their modification for $n_0$ = 30, 50, 70 and 90,
$\rho$ = 0 (independent) and 0.5 (dependent), and for different values of
$\delta$. The last row of this table gives the maximum of the standard errors
of the estimated (simulated) FDRs in each column.

Table 2
Simulated values of the FDR of the Bonferroni and Sidak procedures and their
modifications with $\alpha = 0.05$

                        Independent ($\rho = 0$)                 Dependent ($\rho = 0.5$)
                   Bonferroni           Sidak             Bonferroni           Sidak
n_0    delta   Original  Modified  Original  Modified  Original  Modified  Original  Modified
30     0.5      0.0118    0.0150    0.0119    0.0167    0.0048    0.0167    0.0049    0.0332
       1.5      0.0045    0.0073    0.0045    0.0079    0.0006    0.0066    0.0006    0.0412
       2.5      0.0008    0.0019    0.0008    0.0022    0.0002    0.0054    0.0002    0.0412
50     0.5      0.0218    0.0259    0.0222    0.0276    0.0092    0.0307    0.0093    0.0493
       1.5      0.0103    0.0147    0.0106    0.0149    0.0015    0.0116    0.0015    0.0455
       2.5      0.0021    0.0031    0.0021    0.0033    0.0006    0.0093    0.0006    0.0441
70     0.5      0.0315    0.0349    0.0319    0.0359    0.0141    0.0488    0.0144    0.0667
       1.5      0.0187    0.0237    0.0189    0.0232    0.0034    0.0196    0.0034    0.0494
       2.5      0.0052    0.0061    0.0052    0.0061    0.0013    0.0154    0.0014    0.0463
90     0.5      0.0414    0.0423    0.0423    0.0434    0.0234    0.0734    0.0240    0.0903
       1.5      0.0351    0.0382    0.0359    0.0393    0.0108    0.0414    0.0111    0.0642
       2.5      0.0173    0.0180    0.0175    0.0189    0.0045    0.0311    0.0046    0.0554
MaxSE           0.0028    0.0028    0.0028    0.0029    0.0020    0.0034    0.0021    0.0038

As we expected, the modified Sidak procedure provided the least conser- 
vative control of FDR under independence. Since the Bonferroni and Sidak 
procedures are relatively more conservative when the actual proportion of 
true null hypotheses is small, the idea of improving them using an estimate 
of $n_0$ should work well in this situation. This idea is confirmed by our
numerical study. Both modified Bonferroni and modified Sidak procedures are
seen to control FDR much less conservatively than their unmodified versions
under independence. In the dependent case, however, the idea of improving
the Bonferroni and Sidak procedures may not work unless $n_0$ is small and
the dependence is weak.
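
A simulation of this kind can be sketched as follows. This is not the code used for Table 2: it covers only the original and modified Sidak procedures, it uses the reading of (3.11) in which 1 - F_0(t_tau(k)) = (1 - F_0(tau)) [1 - (1 - m)^(1/(n-k))] with m = min{1, alpha (n - k) F_0(tau) / ((k + 1)(1 - F_0(tau)))}, and it generates equicorrelated normals through a shared component. All function names and default settings are illustrative assumptions.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # null distribution F0: standard normal


def fdp(xs, n0, threshold):
    """False discovery proportion; the first n0 hypotheses are the true nulls."""
    rejected = [i for i, x in enumerate(xs) if x >= threshold]
    if not rejected:
        return 0.0
    return sum(1 for i in rejected if i < n0) / len(rejected)


def modified_sidak_threshold(xs, tau, alpha):
    """Second-step critical value t_tau(k), with k = #{i : X_i < tau}."""
    n = len(xs)
    k = sum(1 for x in xs if x < tau)
    if k == n:
        return float("inf")
    f_tau = STD.cdf(tau)
    tail_tau = 1 - f_tau
    m = min(1.0, alpha * (n - k) * f_tau / ((k + 1) * tail_tau))
    p = 1 - tail_tau * (1 - (1 - m) ** (1 / (n - k)))
    return STD.inv_cdf(p) if p < 1 else float("inf")


def simulate_fdr(n=100, n0=50, delta=1.5, rho=0.0, alpha=0.05, tau=0.0,
                 iterations=2000, seed=0):
    """Monte Carlo FDR estimates for the Sidak and modified Sidak procedures."""
    rng = random.Random(seed)
    t_sidak = STD.inv_cdf((1 - alpha) ** (1 / n))
    means = [0.0] * n0 + [delta] * (n - n0)
    totals = {"sidak": 0.0, "modified_sidak": 0.0}
    for _ in range(iterations):
        shared = rng.gauss(0.0, 1.0)  # common component gives correlation rho
        xs = [mu + rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
              for mu in means]
        totals["sidak"] += fdp(xs, n0, t_sidak)
        totals["modified_sidak"] += fdp(
            xs, n0, modified_sidak_threshold(xs, tau, alpha))
    return {name: total / iterations for name, total in totals.items()}


if __name__ == "__main__":
    print(simulate_fdr(rho=0.0))
    print(simulate_fdr(rho=0.5))
```

The estimates depend on the seed and on the number of iterations, so they should be compared with Table 2 only up to Monte Carlo error.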

Having found more than one procedure that can control the FDR under 
independence (e.g., the Bonferroni, Sidak and their modifications), compar- 
ing them further in terms of power seems to be the next important objective. 
While the idea of power can be conceptualized in terms of Type II errors
(false negatives) in several different ways, extending it from single testing
to multiple testing, one particular concept, which is the average power [i.e.,
$E(S)/n_1$], has been used in a number of recent papers to compare
FDR-controlling procedures [4, 17, 19]. However, it is argued in [15] that since the
FDR is a measure of false positives, it seems more appropriate to compare 
different FDR-controlling procedures using a similar measure in terms of 
false negatives, the FNR [8, 15]. It will be interesting to see how the differ- 
ent FDR-controlling procedures in this paper compare in terms of measures 
involving FNR under the same distributional setting. This will be carried 
out in Section 5 after deriving some results on FNR in the next section. 



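4. Results on FNR for fixed $J_0$ and $J_1$. We will derive in this section
some results on FNR of a single-step procedure, analogous to those on FDR,
again assuming a fixed configuration of true and false null hypotheses. First,
we have the following lemma.

Lemma 2. An explicit expression of FNR is

$$
\begin{aligned}
\mathrm{FNR}_\theta(t; J_0, J_1)
&= \sum_{i \in J_1} \left[ P_{\theta_i}\{X_i < t\}
   - \sum_{j=1}^{n-1} \frac{P_\theta\{X^{(-i)}_{(j)} < t,\ X_i < t\}}{j(j+1)} \right] \\
&= P_\theta\{X_{(1)} < t\}
   - \sum_{i \in J_0} \left[ P_{\theta_i}\{X_i < t\}
   - \sum_{j=1}^{n-1} \frac{P_\theta\{X^{(-i)}_{(j)} < t,\ X_i < t\}}{j(j+1)} \right].
\end{aligned}
\tag{4.1}
$$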
Making the same kind of arguments as we made before for the monotonicity
property of the FDR, we notice that if $X$ is stochastically increasing in
$\theta$, the FNR is increasing in $\{\theta_i : i \in J_0\}$ for fixed
$\{\theta_i : i \in J_1\}$ and is decreasing in $\{\theta_i : i \in J_1\}$ for
fixed $\{\theta_i : i \in J_0\}$. In other words,
$\mathrm{FNR}_\theta(t; J_0, J_1)$ decreases as $\theta_i$ moves away from
$\theta_{i0}$ for at least one $i \in J_0$ or at least one $i \in J_1$, with

$$\sup_\theta \mathrm{FNR}_\theta(t; J_0, J_1) = \mathrm{FNR}_{\theta_0}(t; J_0, J_1). \tag{4.2}$$

Since, when $\theta = \theta_0$, $X$ is exchangeable, the right-hand side in
(4.2) reduces to

$$
n_1 \left[ F_0(t) - \sum_{j=1}^{n-1}
  \frac{P_{\theta_0}\{X^{(-i)}_{(j)} < t,\ X_i < t\}}{j(j+1)} \right]
= \frac{n_1}{n} P_{\theta_0}\{X_{(1)} < t\}.
\tag{4.3}
$$

The equality in (4.3) follows from (4.1); see also [13]. This gives the next
main result of this article.

Theorem 3. If $X$ is stochastically increasing in $\theta$, then
$\mathrm{FNR}_\theta(t; J_0, J_1)$ decreases as $\theta_i$ moves away from
$\theta_{i0}$ for at least one $i \in J_0$ or for at least one $i \in J_1$.
Furthermore, if $X$ is exchangeable when $\theta = \theta_0$, then

$$\sup_\theta \mathrm{FNR}_\theta(t; J_0, J_1) = \frac{n_1}{n} P_{\theta_0}\{A > 0\}. \tag{4.4}$$

Clearly, the FNR of a single-step procedure can be controlled at a level
$\beta$ under the condition stated in the above theorem by choosing a fixed
$t$ subject to the condition
$P_{\theta_0}\{A > 0\} = P_{\theta_0}\{\min_{i \in J} X_i < t\} \le \beta$. If the
dependence structure of $X$ is not utilized, the equation $F_0(t) = \beta/n$
provides a Bonferroni-type choice for $t$. When $X$ is known to be positively
dependent so that the inequality
$P_{\theta_0}\{\min_{j \in J} X_j \ge t\} \ge \bar F_0^{\,n}(t)$ is true, with the
equality holding under independence, Sidak-type $t$ can be determined from the
equation $F_0(t) = 1 - (1 - \beta)^{1/n}$. These procedures can potentially be
improved in terms of having better control of FNR by borrowing information
from the $X_i$'s exceeding an appropriately chosen value $\tau$.

The following theorem is a FNR analog of Theorem 2 that extends the 
inequality on FNR given by Theorem 3 from a single-step to a two-step 
procedure and suggests how to modify the above single-step FNR-controlling 
procedures. 



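Theorem 4. Under the conditions stated in Theorem 2, the FNR of a
two-step procedure with $t_\tau(k) \le \tau$ for all $k = 0, 1, \ldots, n$
satisfies the inequality

$$
\mathrm{FNR}^{(2)}_\theta(t_\tau \le \tau; J_0, J_1)
\le F_0(\tau) \sum_{i \in J_1} \sum_{k=1}^{n} \frac{1}{k}
\left[ 1 - \left( 1 - \frac{F_0(t_\tau(k))}{F_0(\tau)} \right)^{k} \right]
P_\theta\{X^{(-i)}_{(k-1)} < \tau \le X^{(-i)}_{(k)}\}.
\tag{4.5}
$$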

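When $\tau \to \infty$, $k = n$ with probability 1 and the above inequality
reduces to that given by Theorem 3 under independence with $t = t_\infty(n)$.
We modify the Sidak procedure using a two-step procedure with
$t_\tau(k) \le \tau$ satisfying

$$
F_0(t_\tau(k)) = F_0(\tau)
\left[ 1 - \left( 1 - \min\left\{ 1,\ \frac{\beta k \bar F_0(\tau)}{(n-k+1) F_0(\tau)} \right\} \right)^{1/k} \right]
\tag{4.6}
$$

and $t_\tau(0) = -\infty$. For this modified Sidak procedure,

$$
\begin{aligned}
\mathrm{FNR}^{(2)}_\theta(t_\tau \le \tau; J_0, J_1)
&\le \beta \bar F_0(\tau) \sum_{i \in J_1} \sum_{k=1}^{n} \frac{1}{n-k+1}
  P_\theta\{X^{(-i)}_{(k-1)} < \tau \le X^{(-i)}_{(k)}\} \\
&\le \beta \sum_{i \in J} \sum_{k=0}^{n-1} \frac{1}{n-k}
  P_\theta\{X_i \ge \tau,\ X^{(-i)}_{(k)} < \tau \le X^{(-i)}_{(k+1)}\} \\
&= \beta \sum_{k=0}^{n-1} P_\theta\{X_{(k)} < \tau \le X_{(k+1)}\} \\
&= \beta P_\theta\{X_{(n)} \ge \tau\}.
\end{aligned}
\tag{4.7}
$$

The second inequality in (4.7) follows from the fact that
$\bar F_0(\tau) \le \bar F_{\theta_i}(\tau)$ for $i \in J_1$; for the first
equality, see [13]. Thus, the above modified Sidak procedure controls
FNR under independence.

The right-hand side of (4.5) is less than or equal to

$$
\sum_{i \in J_1} \sum_{k=1}^{n} F_0(t_\tau(k))\,
P_\theta\{X^{(-i)}_{(k-1)} < \tau \le X^{(-i)}_{(k)}\}.
\tag{4.8}
$$

This is less than or equal to the right-hand side of the first inequality in
(4.7), which is less than or equal to $\beta$, if we choose
$t_\tau(k) \le \tau$ satisfying

$$F_0(t_\tau(k)) = \min\left\{ F_0(\tau),\ \frac{\beta \bar F_0(\tau)}{n-k+1} \right\}. \tag{4.9}$$

This gives us our FNR-controlling modified Bonferroni procedure, which is
of course more conservative than the modified Sidak procedure in the sense
that it allows fewer nondiscoveries.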



Remark 1. It is important to note that the above results on FNR
have been developed with the idea of controlling false nondiscoveries of
any set of true alternatives (or false nulls). However, one is often
interested in controlling false nondiscoveries of a prespecified set of true
alternatives. These results can be easily modified in such a situation. Let
$\theta_i = \theta_{i1}$ for some specified $\theta_{i1} > \theta_{i0}$,
$i \in J_1$. Assume that $X$ is exchangeable under
$\theta = \theta_1 = (\theta_{11}, \ldots, \theta_{n1})$. Then Theorem 3 can be
modified to

$$\sup \mathrm{FNR}_\theta(t; J_0, J_1) = \frac{n_1}{n} P_{\theta_1}\{A > 0\} \tag{4.10}$$

and Theorem 4 can be modified to the corresponding inequality for
$\mathrm{FNR}^{(2)}_\theta(t_\tau \le \tau; J_0, J_1)$ with $F_0$ replaced by
$F_1$, where $F_1$ is the common c.d.f. of $X_i$ under $\theta_{i1}$. The
Bonferroni and Sidak procedures as well as their two-step modifications using
critical values based on $F_1$ will provide better control of FNR in this case
than values based on $F_0$.

We conducted a numerical study to investigate how well these different 
FNR procedures control FNR under a specified set of true alternatives. We 
noticed, as in the case of controlling FDR, that although both modified 
Bonferroni and Sidak procedures often control FNR much less conservatively 
than their unmodified versions, the modified Sidak procedure provides the 
best control of FNR. 

5. A numerical study. In this section we compare the different 
FDR-controlling procedures under independence discussed in Section 3 in 
terms of a concept of power that relates to the unbiasedness condition Sarkar 
[15] introduced. Since the FDR measures the expected proportion of incor- 
rect decisions, a good multiple testing procedure must ensure that it does not 
exceed the expected proportion of correct decisions. The quantity 1 — FNR, 
which Genovese and Wasserman [8] called the correct nondiscovery rate, is a 
measure of correct decisions. In situations where controlling false negatives 
is of primary importance, the FNR provides a measure of incorrect deci- 
sions with the corresponding measure of correct decisions being 1 — FDR. 
Whether we have a multiple testing procedure designed to control FDR or 
FNR, the inequality FDR + FNR < 1 represents a desirable property for any 
such multiple testing procedure. This is referred to as the unbiasedness con- 
dition of an FDR- or FNR-controlling multiple testing procedure. A natural 
way to compare different FDR- or FNR-controlling procedures would be to 
see how they perform in terms of a measure that reflects the strength of
unbiasedness. This leads us to the consideration of the quantity 

(5.1) $\pi_\theta = 1 - \mathrm{FDR}_\theta - \mathrm{FNR}_\theta$.

It is also related to the idea of Genovese and Wasserman [8], who suggested
using $1 - \pi_\theta$ as a risk function to compare multiple testing procedures. This
is our concept of power.
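
As a minimal illustration of the quantity in (5.1), and not a reproduction of the study reported here, the following Python sketch estimates FDR, FNR and 1 - FDR - FNR by simulation for a single-step rule that rejects $H_i$ when $X_i > t$. Several ingredients are assumptions made only for this sketch: the data are independent normals with unit variance, true nulls have mean 0 and false nulls have mean `delta`, and the cutoffs are the classical Bonferroni and Sidak cutoffs at level `alpha`, which need not coincide with the FDR-controlling versions of Section 3. All function names are hypothetical.

```python
import random
from statistics import NormalDist


def bonferroni_cutoff(alpha, n):
    """Cutoff t with 1 - Phi(t) = alpha / n (classical Bonferroni; an assumption)."""
    return NormalDist().inv_cdf(1.0 - alpha / n)


def sidak_cutoff(alpha, n):
    """Cutoff t with Phi(t)^n = 1 - alpha (classical Sidak; an assumption)."""
    return NormalDist().inv_cdf((1.0 - alpha) ** (1.0 / n))


def simulate_power(n, n0, delta, t, reps=2000, seed=0):
    """Monte Carlo estimates of (FDR, FNR, 1 - FDR - FNR) for 'reject H_i if X_i > t'.

    The first n0 hypotheses are true nulls (mean 0); the rest are false
    nulls (mean delta). Observations are independent N(mean, 1).
    """
    rng = random.Random(seed)
    fdp_sum = 0.0
    fnp_sum = 0.0
    for _ in range(reps):
        rejected = false_rejections = 0
        accepted = false_acceptances = 0
        for i in range(n):
            is_null = i < n0
            x = rng.gauss(0.0 if is_null else delta, 1.0)
            if x > t:
                rejected += 1
                false_rejections += int(is_null)
            else:
                accepted += 1
                false_acceptances += int(not is_null)
        # Proportions are taken to be 0 when the denominator is 0.
        fdp_sum += false_rejections / rejected if rejected > 0 else 0.0
        fnp_sum += false_acceptances / accepted if accepted > 0 else 0.0
    fdr = fdp_sum / reps
    fnr = fnp_sum / reps
    return fdr, fnr, 1.0 - fdr - fnr


if __name__ == "__main__":
    n, n0, alpha = 20, 10, 0.05
    for name, cutoff in (("Bonferroni", bonferroni_cutoff(alpha, n)),
                         ("Sidak", sidak_cutoff(alpha, n))):
        print(name, simulate_power(n, n0, delta=2.0, t=cutoff))
```

Because the Sidak cutoff is never larger than the Bonferroni cutoff at the same level, the two rules can be compared on a common set of simulated data by fixing the seed.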

We investigated how the different FDR procedures in Section 3 perform 
in terms of the aforementioned concept of power. We computed the FNR 
and then the power 1 - FNR - FDR for the Bonferroni and Sidak procedures
and their modified versions [with $F_0(\tau) = 1/2$] based on the normal data that
have been simulated before for FDR calculations. These simulated powers
are displayed in Figure 1. As we see from this figure, the modified Sidak 
procedure is often the most powerful under independence, especially, as one 
would expect, when the proportion of true null hypotheses is relatively small. 
The unmodified Bonferroni and Sidak procedures, not surprisingly, are
practically indistinguishable in terms of their power performance. One should,
however, be cautious in interpreting this graph in the dependent case
(particularly, the upper right two panels), in light of Table 2, which indicates
that the modified Bonferroni and Sidak procedures may fail to control FDR
unless the dependence is weak and $n_0$ is small.

We should point out that the unbiasedness property of the single-step 
procedures, which is numerically seen to hold, can be theoretically proved 
easily from Theorems 1 and 3. However, a theoretical justification of the 
same property for the two-step procedures, which appears to be also true 
from Figure 1, is an interesting and a more challenging theoretical problem. 

Fig. 1. Comparison of Bonferroni and Sidak procedures with their modified versions in
terms of 1 - FDR - FNR. [Legend: Bonfer, Mbon, MSidak, Sidak; horizontal axis: delta.]
Also, the same concept of power could be used to compare different
FNR-controlling procedures.

6. Results on FDR and FNR under a mixture model. In this section, 
we present appropriate modifications to Lemmas 1 and 2 when a mixture 
approach is taken as in [7, 17]. We will, however, assume a slightly more 
general mixture model in the sense that it does not assume independence 
of the test statistics. More specifically, we first let $H = (H_1, \ldots, H_n)$, with
$H_i = 0$ indicating that $H_i$ is true and $H_i = 1$ indicating that it is false. Then
we assume that $(X_i, H_i)$, $i = 1, \ldots, n$, have the distribution

(6.1) $X \mid H \sim f(x, \theta_H)$, where $\theta_H = (\theta_{H_1}, \ldots, \theta_{H_n})$, $\theta_{H_i} = (1 - H_i)\theta_i' + H_i\theta_i''$,
with $\theta_i' \le \theta_{i0}$, $\theta_i'' > \theta_{i0}$, $i = 1, \ldots, n$,
and $H \sim \pi_h$, where $\pi_h$ are some probabilities defined on
$\mathcal{H} = \{h = (h_1, \ldots, h_n) : h_i = 0 \text{ or } 1\}$.

Regarding $f$, we assume that it belongs to a location family of
distributions; that is, $f(x, \theta_h) = f(x - \theta_h)$, with a positive dependence structure
that ensures that, for any increasing (or decreasing) function $\phi$ of $X$, the
expectation $E\{\phi(X) \mid X_i, H\}$ is increasing (or decreasing) in $X_i$. This is true
if, for instance, $X$ is positive regression dependent on subset (PRDS) under
the density $f(x)$, as in the case of multivariate normal with positive
correlations and many other multivariate distributions encountered in multiple
testing; see, for example, [5, 14]. Of course, when $(X_i, H_i)$, $i = 1, \ldots, n$, are
independent, we assume no particular form for the density $f$; that is, we
simply assume that $X_i \mid H_i \sim f(x, \theta_{H_i})$. Since we assume that $\theta_i$ takes the
value $\theta_i'$ when $H_i = 0$ and the value $\theta_i''$ when $H_i = 1$, the probabilities in the
following discussion are all evaluated under these fixed $\theta' = (\theta_1', \ldots, \theta_n')$ and
$\theta'' = (\theta_1'', \ldots, \theta_n'')$.



Theorem 5. Under the above mixture model and the conditions assumed
therein,

(6.2) $\mathrm{FDR}(t, n) \le \sum_{i=1}^{n} \delta_i P\{H_i = 0 \mid X_i > t\}$,

where

(6.3) $\delta_i = P\{X_i > t\} - \sum_{j=1}^{n-1} \frac{P\{X^{(-i)}_{(j)} > t, X_i > t\}}{(n-j)(n-j+1)}$ and $\sum_{i=1}^{n} \delta_i = P\{R > 0\}$,

with the equality holding when the $(X_i, H_i)$'s are independent.
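
The weights in (6.3) can be evaluated in closed form in one simple case, which gives a quick check that they sum to $P\{R > 0\}$. The sketch below is an illustration under an added assumption, namely that $X_1, \ldots, X_n$ are i.i.d. with $P\{X_i \le t\} = F(t)$; then the event that the $j$th smallest of the other $n - 1$ values exceeds $t$ is the event that at most $j - 1$ of them are at most $t$, a binomial probability, and $P\{R > 0\} = 1 - F(t)^n$. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, m, q):
    """P{B <= k} for B ~ Binomial(m, q)."""
    return sum(comb(m, b) * q**b * (1.0 - q) ** (m - b) for b in range(0, k + 1))


def delta_iid(n, F_t):
    """delta_i of (6.3) when X_1, ..., X_n are i.i.d. with P{X_i <= t} = F_t.

    Assumption for this sketch: independence, so that
    P{X^(-i)_(j) > t, X_i > t} = P{X_i > t} * P{at most j - 1 of the others <= t}.
    """
    p = 1.0 - F_t  # P{X_i > t}
    correction = 0.0
    for j in range(1, n):
        prob_j = binom_cdf(j - 1, n - 1, F_t)
        correction += p * prob_j / ((n - j) * (n - j + 1))
    return p - correction


if __name__ == "__main__":
    n, F_t = 7, 0.8
    # The n identical weights should add up to P{R > 0} = 1 - F_t**n.
    print(n * delta_iid(n, F_t), 1.0 - F_t**n)
```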
When $(X_i, H_i)$, $i = 1, \ldots, n$, are identically distributed, Theorem 5 reduces
to

(6.4) $\mathrm{FDR}(t, n) \le P\{H_1 = 0 \mid X_1 > t\} P\{R > 0\}$.

The equality in (6.4) holds when $(X_i, H_i)$, $i = 1, \ldots, n$, are i.i.d., which is
Storey's [17, 18] result, providing a "Bayesian Type I error rate" interpretation
to his notion of pFDR = FDR/$P\{R > 0\}$. Thus, the following corollary
to Theorem 5 is an extension of his result to the dependent case.

Corollary 1. Under the above mixture model and the conditions
assumed therein,

(6.5) $\mathrm{pFDR}(t, n) \le \max_{1 \le i \le n} P\{H_i = 0 \mid X_i > t\}$.

When the $(X_i, H_i)$'s are identically distributed, we have

(6.6) $\mathrm{pFDR}(t, n) \le P\{H_1 = 0 \mid X_1 > t\}$,

with the equality holding when the $(X_i, H_i)$'s are i.i.d.
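
The equality case of (6.6) can be illustrated numerically. The following sketch is a hedged example rather than part of the paper's study: it assumes i.i.d. pairs $(X_i, H_i)$ with $P\{H_i = 0\} = \pi_0$ (`pi0`) and $X_i \mid H_i$ normal with unit variance and mean `theta_null` or `theta_alt`, all of which are illustrative choices. It computes $P\{H_1 = 0 \mid X_1 > t\}$ by Bayes' rule and compares it with a Monte Carlo estimate of pFDR.

```python
import random
from statistics import NormalDist


def posterior_null(t, pi0, theta_null, theta_alt):
    """P{H_1 = 0 | X_1 > t} for X_1 | H_1 ~ N(theta, 1) and P{H_1 = 0} = pi0."""
    std_normal = NormalDist()
    tail_null = 1.0 - std_normal.cdf(t - theta_null)
    tail_alt = 1.0 - std_normal.cdf(t - theta_alt)
    return pi0 * tail_null / (pi0 * tail_null + (1.0 - pi0) * tail_alt)


def simulate_pfdr(n, t, pi0, theta_null, theta_alt, reps=20000, seed=0):
    """Monte Carlo estimate of pFDR = E(V/R | R > 0) with i.i.d. (X_i, H_i)."""
    rng = random.Random(seed)
    ratio_sum = 0.0
    n_with_rejections = 0
    for _rep in range(reps):
        v = r = 0
        for _i in range(n):
            is_null = rng.random() < pi0
            x = rng.gauss(theta_null if is_null else theta_alt, 1.0)
            if x > t:
                r += 1
                v += int(is_null)
        if r > 0:
            ratio_sum += v / r
            n_with_rejections += 1
    if n_with_rejections == 0:
        raise ValueError("no replication produced a rejection")
    return ratio_sum / n_with_rejections


if __name__ == "__main__":
    args = dict(t=1.0, pi0=0.7, theta_null=0.0, theta_alt=2.0)
    print(posterior_null(**args), simulate_pfdr(n=5, **args))
```

Under positive dependence of the kind assumed above, Corollary 1 states that the first number is an upper bound for pFDR rather than its exact value; the simulation here covers only the independent case.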

Storey [17] introduced a pFDR analog of the p-value, called the q-value,
that provides a measure of the strength of the tests in a multiple testing
procedure with respect to pFDR. For a single-step multiple testing procedure
of $n$ hypotheses with a rejection region of the form $X_i > t$ for each $H_i$, it is
defined as

(6.7) $q_n(t) = \inf_{x \le t} \mathrm{pFDR}(x, n)$.

Storey [17], however, considered this quantity when $(X_i, H_i)$, $i = 1, \ldots, n$,
are i.i.d., which is

(6.8) $q(t, H_1) = \inf_{x \le t} P\{H_1 = 0 \mid X_1 > x\}$.

Corollary 1 says that when the $(X_i, H_i)$'s are dependent with common
marginals, in the sense assumed in that corollary, we have $q_n(t) \le q(t, H_1)$.
That is, the q-value of a single-step multiple test procedure obtained under
certain commonly encountered types of dependence is more conservative, as
one would want, compared to the corresponding i.i.d. case.
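
To make the infimum in (6.8) concrete, the following sketch approximates it on a grid. It is an illustration under assumptions not made in the text: a two-point normal location model with unit variance, prior null probability `pi0`, and a finite grid on `[lo, t]` standing in for all $x \le t$. The names `posterior_null` and `q_value` are hypothetical.

```python
from statistics import NormalDist


def posterior_null(x, pi0, theta_null, theta_alt):
    """P{H_1 = 0 | X_1 > x} for X_1 | H_1 ~ N(theta, 1) and P{H_1 = 0} = pi0."""
    std_normal = NormalDist()
    tail_null = 1.0 - std_normal.cdf(x - theta_null)
    tail_alt = 1.0 - std_normal.cdf(x - theta_alt)
    return pi0 * tail_null / (pi0 * tail_null + (1.0 - pi0) * tail_alt)


def q_value(t, pi0, theta_null, theta_alt, lo=-8.0, steps=2000):
    """Grid approximation to inf over x <= t of P{H_1 = 0 | X_1 > x}.

    The grid is evenly spaced on [lo, t]; both lo and steps are
    illustrative choices, and t is assumed moderate so that the
    normal tail probabilities stay positive.
    """
    best = 1.0
    for k in range(steps + 1):
        x = lo + (t - lo) * k / steps
        best = min(best, posterior_null(x, pi0, theta_null, theta_alt))
    return best


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        print(t, q_value(t, pi0=0.7, theta_null=0.0, theta_alt=2.0))
```

In this particular normal model the posterior null probability is decreasing in $x$, so the infimum is attained at $x = t$ and the grid search only confirms that; the search is kept so that the sketch still applies to a model where the posterior is not monotone.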

Theorem 6. Under the conditions stated in Theorem 5,

(6.9) $\mathrm{FNR} \le \sum_{i=1}^{n} \gamma_i P\{H_i = 1 \mid X_i \le t\}$,

where

(6.10) $\gamma_i = P\{X_i \le t\} - \sum_{j=1}^{n-1} \frac{P\{X^{(-i)}_{(j)} \le t, X_i \le t\}}{j(j+1)}$ and $\sum_{i=1}^{n} \gamma_i = P\{A > 0\}$,

with the equality holding when the $(X_i, H_i)$'s are independent.

This theorem can be proved following arguments similar to those used to
prove Theorem 5 and with the help of an identity for $P\{A > 0\}$ given by
Sarkar [13].

Corollary 2. Under the conditions stated in Theorem 5,

(6.11) $\mathrm{pFNR} \le \max_{1 \le i \le n} P\{H_i = 1 \mid X_i \le t\}$.

When the $(X_i, H_i)$'s are identically distributed, we have

(6.12) $\mathrm{pFNR} \le P\{H_1 = 1 \mid X_1 \le t\}$,

with the equality holding when the $(X_i, H_i)$'s are i.i.d.

7. Proofs. 

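Proof of Lemma 1. The FDP is given by

(7.1) $Q(t; J_0, J_1) = \sum_{i \in J_0} \sum_{j=0}^{n-1} \frac{1}{n-j} I\{R = n-j, X_i > t\}$.

Since $\{R = n - j\} = \{X_{(j)} \le t < X_{(j+1)}\}$, with $X_{(0)} = -\infty$ and $X_{(n+1)} = \infty$,
we have

$\{R = n-j, X_i > t\} = \{X^{(-i)}_{(j)} \le t < X^{(-i)}_{(j+1)}, X_i > t\}$.

Therefore,

(7.2) $Q(t; J_0, J_1) = \sum_{i \in J_0} \sum_{j=0}^{n-1} \frac{1}{n-j} I\{X^{(-i)}_{(j)} \le t < X^{(-i)}_{(j+1)}, X_i > t\}$

$= \sum_{i \in J_0} \sum_{j=0}^{n-1} \frac{1}{n-j} \left[ I\{X^{(-i)}_{(j+1)} > t, X_i > t\} - I\{X^{(-i)}_{(j)} > t, X_i > t\} \right]$

$= \sum_{i \in J_0} I\{X_i > t\} - \sum_{i \in J_0} \sum_{j=1}^{n-1} \frac{I\{X^{(-i)}_{(j)} > t, X_i > t\}}{(n-j)(n-j+1)}$.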
Taking the expectation in (7.2), we get the first expression of the FDR in
Lemma 1. The second expression follows from the fact that $Q$ reduces to
$I\{R > 0\} = I\{X_{(n)} > t\}$ if we consider the first summation in (7.2) over all
$i \in J$. □
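
The algebra behind (7.1) and (7.2) can be checked numerically. The sketch below is only an illustration with hypothetical function names: it computes the false discovery proportion $V/R$ directly and through the order-statistic form, in which each rejected true null contributes 1 minus the sum of $1/\{(n-j)(n-j+1)\}$ over those $j$ for which the $j$th smallest of the remaining $n - 1$ observations exceeds $t$.

```python
import random


def fdp_direct(x, null_idx, t):
    """V / R for the rule 'reject H_i if x[i] > t' (taken as 0 when R = 0)."""
    r = sum(1 for xi in x if xi > t)
    v = sum(1 for i in null_idx if x[i] > t)
    return v / r if r > 0 else 0.0


def fdp_order_stat(x, null_idx, t):
    """The same proportion written in the order-statistic form of (7.2)."""
    n = len(x)
    total = 0.0
    for i in null_idx:
        if x[i] <= t:
            continue  # H_i is not rejected, so it contributes nothing
        # Ordered values of the other n - 1 observations, X^(-i)_(1) <= ...
        others = sorted(x[:i] + x[i + 1:])
        term = 1.0
        for j in range(1, n):
            if others[j - 1] > t:
                term -= 1.0 / ((n - j) * (n - j + 1))
        total += term
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(6)]
    null_idx = [0, 1, 2]  # illustrative choice of true nulls
    print(fdp_direct(x, null_idx, 0.3), fdp_order_stat(x, null_idx, 0.3))
```

The two functions agree up to floating-point rounding because the inner sum telescopes to $1 - 1/R$ for every rejected hypothesis.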



Proof of Theorem 2. First note that 



(7.3) 



FDRf (t,>T;Jo,Ji) 

n 

= J2 Ee{Q{tr{k); Jo, Ji)I{X^k) <r< 

k=0 



Since $t_\tau(k) \ge \tau$ for all $k$, when $k = n$ (i.e., when $X_{(n)} \le \tau$) there is no
rejection of null hypotheses, implying that $Q = 0$.

Let $F_{\theta_i}(x)$ and $f_{\theta_i}(x)$, respectively, be the c.d.f. and the density of $X_i$
under any alternative $\theta_i$ for $i = 1, \ldots, n$. Since the $X_i$'s are assumed to be
independent, the conditional expectation of $Q(t_\tau(k); J_0, J_1)$, given $\{X_{(k)} \le \tau < X_{(k+1)}\}$,
for $k = 0, 1, \ldots, n - 1$, is the FDR of the single-step procedure based
on $n - k$ independent random variables $Y_1, \ldots, Y_{n-k}$ with $Y_i \sim f_{\theta_i}(x) I(x > \tau)/[1 - F_{\theta_i}(\tau)]$
and critical value $t_\tau(k)$. Since the density of $Y_i$ has the MLR property,
implying that $(Y_1, \ldots, Y_{n-k})$ is stochastically increasing, we have from
Theorem 1 that this conditional expectation is



< 



(7.4) 



no{T) 
n — k 
rao(r) 

n — k 



Peo(, max Y>Uik)] 
Fo{r) J , 



where $n_0(\tau) = \sum_{i \in J_0} I(X_i > \tau)$. Going back to (7.3), we then have



n-l 

fc=OiGJo 



1 

n — k 



1 



1 



Fo{t) 



(7.5) 



X I{Xi > T,X^k) <r< ^(fc+i)} 



n-l 



EE 

iG Jo k=0 



1 



n — k 



Fo{r) 



xPe{X,>r,xl~;UT<xl~l[^}, 



(fe+i)J 



which is the required inequality in Theorem 2. □ 
Proof of Lemma 2. The FNP is given by



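(7.6) $N(t; J_0, J_1) = \sum_{i \in J_1} \sum_{j=1}^{n} \frac{1}{j} I\{A = j, X_i \le t\}$

$= \sum_{i \in J_1} \sum_{j=1}^{n} \frac{1}{j} I\{X^{(-i)}_{(j-1)} \le t < X^{(-i)}_{(j)}, X_i \le t\}$

$= \sum_{i \in J_1} \sum_{j=1}^{n} \frac{1}{j} \left[ I\{X^{(-i)}_{(j-1)} \le t, X_i \le t\} - I\{X^{(-i)}_{(j)} \le t, X_i \le t\} \right]$

$= \sum_{i \in J_1} I\{X_i \le t\} - \sum_{i \in J_1} \sum_{j=1}^{n-1} \frac{I\{X^{(-i)}_{(j)} \le t, X_i \le t\}}{j(j+1)}$.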



Taking the expectation of (7.6), we get the first expression of the FNR. The 
second expression follows from the fact that 



(7.7) $N(t; J_0, J_1) = I\{A > 0\} - \{U I(A > 0)/A\}$

$= I\{X_{(1)} \le t\} - \sum_{i \in J_0} \sum_{j=1}^{n} \frac{1}{j} I\{A = j, X_i \le t\}$. □



Proof of Theorem 4. We have 
FNRf (t,<r;Jo,Ji) 

n 

= E i?e{[iV(t,(fc); Jo, Ji)|X(fc) < r < 
fc=i 

(7 8) X I{X{k)<'r<X(^k+i)}} 

Fo{t) J 



^EE^^ 

ig Ji k=l 



xPe{X,<r,xl;%<r<xl;;^}y 



The inequality in (7.8) follows from Theorem 3, noting that the conditional
expectation of $N(t_\tau(k); J_0, J_1)$, given $\{X_{(k)} \le \tau < X_{(k+1)}\}$, is the FNR of the
single-step procedure based on independent $Z_1, \ldots, Z_k$ with
$Z_i \sim f_{\theta_i}(x) I(x \le \tau)/F_{\theta_i}(\tau)$. The required inequality in Theorem 4 then follows from (7.8)
because $F_{\theta_i}(\tau)$ is decreasing in $\theta_i$ for $i \in J_1$. □
Proof of Theorem 5. Since $V = \sum_{i=1}^{n} I(X_i > t) I(H_i = 0)$, we first
note from Lemma 1 that the FDR under the mixture model is given by



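(7.9) $\mathrm{FDR}(t, n) = \sum_{i=1}^{n} E_H \left[ I(H_i = 0) \left\{ P\{X_i > t \mid H_i = 0\} - \sum_{j=1}^{n-1} \frac{P\{X^{(-i)}_{(j)} > t, X_i > t \mid H \text{ with } H_i = 0\}}{(n-j)(n-j+1)} \right\} \right]$

$= \sum_{i=1}^{n} \left[ P\{X_i > t, H_i = 0\} - \sum_{j=1}^{n-1} \frac{P\{X^{(-i)}_{(j)} > t, X_i > t, H_i = 0\}}{(n-j)(n-j+1)} \right]$

$= \sum_{i=1}^{n} P\{X_i > t, H_i = 0\} \left[ 1 - \sum_{j=1}^{n-1} \frac{P\{X^{(-i)}_{(j)} > t \mid X_i > t, H_i = 0\}}{(n-j)(n-j+1)} \right]$.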



We now prove that 



(7.10) $P\{X^{(-i)}_{(j)} > t \mid X_i > t, H_i = 0\} \ge P\{X^{(-i)}_{(j)} > t \mid X_i > t\}$

under the assumed positive dependence condition of the density $f$ of $X$.
Let $\psi(X_i) = P\{X^{(-i)}_{(j)} > t \mid X_i, \theta_i = 0\}$. Then the conditional probability
$P\{X^{(-i)}_{(j)} > t \mid X_i > t, \theta_i\}$ can be written as

(7.11) $\frac{E\{\psi(X_i) I(X_i > t - \theta_i)\}}{E\{I(X_i > t - \theta_i)\}}$,

with the expectations taken with respect to $X_i$ under $\theta_i = 0$. Note that
$\psi(x)$ is an increasing function of $x$ under the assumed positive dependence
condition of $f$. Also, $I(x > t)$ is a totally positive of order two (TP2) function
of $(x, t)$ (see, e.g., [11]). Therefore, the ratio

(7.12) $\frac{E\{\psi(X_i) I(X_i > t)\}}{E\{I(X_i > t)\}}$

is increasing in $t$, because it is the expectation of an increasing function of
a random variable whose distribution is stochastically increasing in $t$. This
proves that $P\{X^{(-i)}_{(j)} > t \mid X_i > t, H_i = 0\} \ge P\{X^{(-i)}_{(j)} > t \mid X_i > t, H_i = 1\}$,
implying that the probability $P\{X^{(-i)}_{(j)} > t \mid X_i > t\}$, being a convex combination
of $P\{X^{(-i)}_{(j)} > t \mid X_i > t, H_i = 0\}$ and $P\{X^{(-i)}_{(j)} > t \mid X_i > t, H_i = 1\}$, is less than
or equal to $P\{X^{(-i)}_{(j)} > t \mid X_i > t, H_i = 0\}$. Thus the required inequality (7.10)
follows.
Applying (7.10) to (7.9), we get the inequality (6.2) to be proved in the
theorem. The fact that

$\sum_{i=1}^{n} \delta_i = P\{R > 0\}$

follows from [13]. Furthermore, it is clear that the equality in (6.2) holds
under independence of $(X_i, H_i)$. Thus, the theorem is proved. □

8. Concluding remarks. We have obtained in this article some theoreti- 
cal results that extend previous work done under the assumption of indepen- 
dent tests. Two of these set the stage for developing our idea to modify the 
FDR- and FNR-controlling Bonferroni and Sidak procedures and obtaining 
wider families of FDR- and FNR-controlling procedures. We developed this 
idea by extending inequalities for FDR and FNR under independence from 
single-step to two-step procedures. In the case of the Bonferroni procedures, 
it is somewhat similar to what Storey, Taylor and Siegmund [19] used to 
modify the FDR-controlling BH procedure (which is, of course, a stepwise 
procedure) under independence. In the case of Sidak procedures, however, it 
is stronger in that we consider modifying less conservative procedures. It is 
important to point out that modifying the Sidak procedure by simply finding 
$t$ that controls $(\hat{n}_0/n)\{1 - F_0^n(t)\}$ (the estimated maximum FDR, which is
basically the idea in modifying the FDR-controlling Bonferroni procedure), 
does not seem to provide much improvement to the Sidak procedure. The 
same is true for the FNR-controlling Sidak procedure. This is what we have 
noticed based on additional simulations not reported here. Also, as is seen 
from Table 2, we need to be cautious using the present modifications when 
there is too much dependence in the tests; they may become anticonser- 
vative. Procedures that control FDR are different from those that control 
FNR. It will be interesting to see if procedures that control both FDR and 
FNR can be developed using the results discussed in this paper. 